Final Project - Final Deliverable

Author

Tarun Tamilselvan, Paras Gautam, Abel Kidane, Romesh Mamidi

Published

December 6, 2024

Introduction:

With electric vehicles (EVs) becoming increasingly popular, there’s a growing interest in how the surge in EVs might impact public safety on American roads. This project seeks to analyze whether the exponential increase in EVs aligns with safer roads or introduces new safety challenges for drivers and pedestrians alike. As we see EV adoption being celebrated for its environmental and cost benefits, a closer look at safety metrics offers a unique angle on the unintended consequences of this rapid technological shift. By exploring the connection between EV growth and trends in motor vehicle crashes, we aim to tell a story that digs into the real, data-backed impacts of EV adoption on road safety across the United States.

Investigation: Are we trading one set of benefits for unforeseen risks? We consider several factors to address this question. One guiding question is whether EVs have different crash rates or types of accidents compared to traditional internal combustion engine (ICE) vehicles. By analyzing crash rates in relation to EV growth, we can explore whether characteristics of EVs themselves (weight, acceleration, or other design factors) correlate with different crash outcomes.

Public Awareness and Adaptation: As a new technology, EVs bring features that may affect driver behavior and awareness, such as rapid acceleration and quieter operation, which could influence accident types and rates. This angle explores how driving an EV changes driver and pedestrian experiences and habits on the road.

Our primary audience includes policymakers, transportation authorities, and general readers interested in road safety, technology, and environmental sustainability. Policymakers and infrastructure planners will find these insights valuable in addressing safety needs as the EV population grows.

(Pre-Program Arrangements) Loading Libraries & Data

library("dplyr")

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library("stringr")
library("ggplot2")
library("plotly")

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout
## Data from https://data.wa.gov/Demographics/Most-Common-New-Electric-Vehicles-by-Model/yu7n-qgtd

electric_vehicle_df <-
  read.csv("Electric_Vehicle_Title_and_Registration_Activity_20241120.csv")

## Data from https://www.nhtsa.gov/data/crash-data-systems
crash_data_df <-
  read.csv("NHTSA_Crash_Data.csv")

Glimpse

# EV Dataset
glimpse(electric_vehicle_df)
Rows: 1,048,575
Columns: 16
$ Clean.Alternative.Fuel.Vehicle.Type <fct> Battery Electric Vehicle (BEV), Ba…
$ VIN..1.10.                          <fct> 1N4BZ0CP3G, 5YJ3E1EB7K, 5YJ3E1EB7K…
$ DOL.Vehicle.ID                      <int> 348273537, 244708467, 244708467, 4…
$ Model.Year                          <int> 2016, 2019, 2019, 2019, 2023, 2019…
$ Make                                <fct> NISSAN, TESLA, TESLA, NISSAN, VOLV…
$ Model                               <fct> Leaf, Model 3, Model 3, Leaf, XC90…
$ Sale.Price                          <int> 0, 0, 58100, 0, 0, 22491, 0, 0, 0,…
$ Sale.Date                           <fct> March 20 2024, , February 06 2019,…
$ Base.MSRP                           <int> 0, 0, 0, 0, 0, 0, 0, 69900, 69900,…
$ Transaction.Type                    <fct> Transfer Title, Original Registrat…
$ Transaction.Date                    <fct> March 27 2024, February 28 2019, F…
$ Year                                <int> 2024, 2019, 2019, 2023, 2024, 2023…
$ County                              <fct> Jefferson, King, King, King, King,…
$ City                                <fct> PORT TOWNSEND, BELLEVUE, BELLEVUE,…
$ State                               <fct> WA, WA, WA, WA, WA, WA, WA, WA, WA…
$ Postal.Code                         <fct> 98368, 98007, 98007, 98027, 98027,…
summary(electric_vehicle_df)
                     Clean.Alternative.Fuel.Vehicle.Type      VIN..1.10.     
 Battery Electric Vehicle (BEV)        :790976           7SAYGDEE7P:   3282  
 Hydrogen Powered Vehicle              :    19           7SAYGDEE6P:   3226  
 Plug-in Hybrid Electric Vehicle (PHEV):257580           7SAYGDEEXP:   3149  
                                                         1N4AZ0CP6D:   3127  
                                                         7SAYGDEE5P:   3126  
                                                         7SAYGDEE8P:   3106  
                                                         (Other)   :1029559  
 DOL.Vehicle.ID        Model.Year          Make            Model       
 Min.   :        4   Min.   :1993   TESLA    :417880   Model 3:165772  
 1st Qu.:156712559   1st Qu.:2016   NISSAN   :153474   Leaf   :150900  
 Median :214818782   Median :2019   CHEVROLET: 98924   Model Y:142761  
 Mean   :214479764   Mean   :2019   FORD     : 61000   Model S: 70597  
 3rd Qu.:258208569   3rd Qu.:2022   BMW      : 48078   Volt   : 52349  
 Max.   :479254772   Max.   :2025   TOYOTA   : 40947   Model X: 35838  
                                    (Other)  :228272   (Other):430358  
   Sale.Price                   Sale.Date        Base.MSRP     
 Min.   :       0                    :753951   Min.   :     0  
 1st Qu.:       0   September 30 2023:   472   1st Qu.:     0  
 Median :       0   March 30 2024    :   457   Median :     0  
 Mean   :   11208   June 30 2023     :   453   Mean   :  2277  
 3rd Qu.:       0   December 29 2023 :   449   3rd Qu.:     0  
 Max.   :12312016   December 30 2023 :   433   Max.   :845000  
                    (Other)          :292360   NA's   :10      
                         Transaction.Type           Transaction.Date  
 Original Registration           :231344   May 01 2024      :   2754  
 Original Title                  :226660   July 31 2024     :   2518  
 Registration at time of Transfer: 46244   September 12 2024:   2329  
 Registration Renewal            :469766   September 03 2024:   2048  
 Registration Replacement        : 17023   July 09 2024     :   2022  
 Temporary Registration          :  8840   June 03 2024     :   1851  
 Transfer Title                  : 48698   (Other)          :1035053  
      Year            County              City            State        
 Min.   :2010   King     :582169   SEATTLE  :194487   WA     :1045352  
 1st Qu.:2020   Snohomish:109731   BELLEVUE : 60512   CA     :    758  
 Median :2022   Pierce   : 71941   REDMOND  : 41252   VA     :    328  
 Mean   :2021   Clark    : 64426   VANCOUVER: 39512   TX     :    253  
 3rd Qu.:2023   Kitsap   : 38073   KIRKLAND : 35163   OR     :    232  
 Max.   :2024   Thurston : 35928   SAMMAMISH: 33455   MD     :    145  
                (Other)  :146307   (Other)  :644194   (Other):   1507  
  Postal.Code    
 98052  : 30339  
 98033  : 19859  
 98004  : 19409  
 98115  : 18567  
 98006  : 18542  
 98012  : 17565  
 (Other):924294  
nrow(electric_vehicle_df)
[1] 1048575
# Crash Dataset
glimpse(crash_data_df)
Rows: 10,000
Columns: 29
$ State               <fct> SD, AL, MS, NJ, OH, AL, NY, MD, NY, OH, KY, KS, OR…
$ County              <fct> County 15, County 98, County 74, County 49, County…
$ Crash_Date          <fct> 2/21/20, 11/22/20, 1/28/21, 7/6/21, 4/12/23, 8/22/…
$ Crash_Time          <fct> 23:17, 22:27, 22:04, 20:44, 13:57, 21:56, 11:56, 2…
$ Day_of_Week         <fct> Tuesday, Wednesday, Monday, Friday, Friday, Friday…
$ Location_Type       <fct> Rural, Highway, Rural, Rural, Residential, Highway…
$ Weather_Conditions  <fct> Rain, Rain, Snow, Snow, Snow, Snow, Rain, Clear, R…
$ Road_Condition      <fct> Dry, Wet, Dry, Dry, Wet, Dry, Dry, Wet, Gravel, Dr…
$ Number_of_Vehicles  <int> 5, 3, 2, 2, 2, 3, 2, 2, 4, 4, 3, 4, 1, 3, 3, 2, 5,…
$ Vehicle_Type_1      <fct> Sedan, Sedan, Sedan, Sedan, EV, Motorcycle, EV, Se…
$ Vehicle_Type_2      <fct> EV, Sedan, Motorcycle, Truck, Motorcycle, SUV, Sed…
$ EV_Indicator_1      <fct> No, No, No, Yes, Yes, Yes, Yes, No, No, Yes, No, Y…
$ EV_Indicator_2      <fct> Yes, No, No, No, No, No, No, No, No, No, No, Yes, …
$ Driver_Age_1        <int> 70, 64, 51, 67, 27, 74, 26, 18, 28, 22, 23, 56, 63…
$ Driver_Age_2        <dbl> 20, 28, 74, 50, 22, 16, 78, 45, 71, 17, 56, 49, NA…
$ Driver_Gender_1     <fct> Male, Female, Female, Male, Male, Female, Male, Ma…
$ Driver_Gender_2     <fct> Male, Female, Male, Male, Male, Male, Male, Male, …
$ Crash_Severity      <fct> Severe Injury, Minor Injury, Minor Injury, Minor I…
$ Crash_Cause         <fct> DUI, Speeding, Distracted Driving, DUI, Fatigue, F…
$ Fatalities          <int> 0, 0, 0, 0, 0, 2, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Injuries            <int> 8, 0, 0, 0, 6, 10, 7, 0, 0, 7, 0, 6, 8, 0, 0, 0, 4…
$ Alcohol_Involvement <fct> Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, No, Yes…
$ Speeding            <fct> Yes, Yes, No, No, No, Yes, No, Yes, Yes, Yes, Yes,…
$ Police_Report_Filed <fct> No, No, Yes, No, No, No, No, Yes, Yes, No, Yes, No…
$ Vehicle_Make_1      <fct> Chevrolet, Tesla, Kia, Ford, Kia, Ford, Chevrolet,…
$ Vehicle_Make_2      <fct> Audi, Hyundai, Ford, Honda, Audi, Hyundai, Hyundai…
$ Year_of_Vehicle_1   <int> 2018, 2009, 2017, 2004, 2016, 2016, 2023, 2016, 20…
$ Year_of_Vehicle_2   <dbl> 2008, 2020, 2023, 2007, 2008, 2000, 2022, 2007, 20…
$ Crash_Year          <int> 2020, 2020, 2021, 2021, 2023, 2020, 2020, 2023, 20…
summary(crash_data_df)
     State            County        Crash_Date     Crash_Time  
 WA     : 231   County 90: 124   1/10/22 :  16   5:10   :  19  
 ID     : 230   County 72: 122   10/5/21 :  15   19:34  :  17  
 NM     : 222   County 94: 121   11/19/22:  14   22:06  :  17  
 DE     : 220   County 55: 120   12/21/21:  14   8:51   :  17  
 PA     : 215   County 56: 120   3/21/21 :  14   16:31  :  15  
 CT     : 214   County 36: 116   7/11/20 :  14   2:52   :  15  
 (Other):8668   (Other)  :9277   (Other) :9913   (Other):9900  
    Day_of_Week       Location_Type  Weather_Conditions
 Friday   :1426   Highway    :2479   Clear:2033        
 Monday   :1471   Residential:2474   Fog  :1981        
 Saturday :1334   Rural      :2483   Rain :2018        
 Sunday   :1459   Urban      :2564   Snow :2005        
 Thursday :1385                      Windy:1963        
 Tuesday  :1451                                        
 Wednesday:1474                                        
           Road_Condition Number_of_Vehicles    Vehicle_Type_1
 Construction Zone:2048   Min.   :1.000      EV        :1943  
 Dry              :1994   1st Qu.:2.000      Motorcycle:2006  
 Gravel           :2000   Median :3.000      Sedan     :2015  
 Icy              :1987   Mean   :3.006      SUV       :1994  
 Wet              :1971   3rd Qu.:4.000      Truck     :2042  
                          Max.   :5.000                       
                                                              
    Vehicle_Type_2 EV_Indicator_1 EV_Indicator_2  Driver_Age_1  
           :1967   No :6232          :1967       Min.   :16.00  
 EV        :1647   Yes:3768       No :6386       1st Qu.:32.00  
 Motorcycle:1609                  Yes:1647       Median :48.00  
 Sedan     :1619                                 Mean   :47.99  
 SUV       :1615                                 3rd Qu.:64.00  
 Truck     :1543                                 Max.   :80.00  
                                                                
  Driver_Age_2   Driver_Gender_1 Driver_Gender_2       Crash_Severity
 Min.   :16.00   Female:4900           :1967     Fatal        :2523  
 1st Qu.:32.00   Male  :5100     Female:4047     Minor Injury :2453  
 Median :48.00                   Male  :3986     No Injury    :2565  
 Mean   :47.93                                   Severe Injury:2459  
 3rd Qu.:64.00                                                       
 Max.   :80.00                                                       
 NA's   :1967                                                        
             Crash_Cause     Fatalities        Injuries     
 Distracted Driving:1648   Min.   :0.0000   Min.   : 0.000  
 DUI               :1654   1st Qu.:0.0000   1st Qu.: 0.000  
 Fatigue           :1678   Median :0.0000   Median : 0.000  
 Mechanical Failure:1704   Mean   :0.6373   Mean   : 2.513  
 Speeding          :1702   3rd Qu.:0.0000   3rd Qu.: 5.000  
 Weather           :1614   Max.   :5.0000   Max.   :10.000  
                                                            
 Alcohol_Involvement Speeding   Police_Report_Filed   Vehicle_Make_1
 No :4987            No :5011   No :4992            BMW      :1041  
 Yes:5013            Yes:4989   Yes:5008            Chevrolet:1018  
                                                    Audi     :1012  
                                                    Ford     :1004  
                                                    Hyundai  :1001  
                                                    Kia      : 996  
                                                    (Other)  :3928  
   Vehicle_Make_2 Year_of_Vehicle_1 Year_of_Vehicle_2   Crash_Year  
          :1967   Min.   :2000      Min.   :2000      Min.   :2020  
 Honda    : 824   1st Qu.:2006      1st Qu.:2005      1st Qu.:2020  
 Kia      : 821   Median :2011      Median :2011      Median :2022  
 Chevrolet: 813   Mean   :2012      Mean   :2011      Mean   :2022  
 Hyundai  : 811   3rd Qu.:2017      3rd Qu.:2017      3rd Qu.:2022  
 Audi     : 808   Max.   :2023      Max.   :2023      Max.   :2023  
 (Other)  :3956                     NA's   :1967                    
nrow(crash_data_df)
[1] 10000

Interactive Visualization (#1)

# clean and prepare the data
ev_trend <- electric_vehicle_df %>%
  filter(!is.na(Year) & Year > 2000) %>%
  mutate(
    Is_Tesla = ifelse(grepl("TESLA", Make, ignore.case = TRUE), 1, 0)
  ) %>%
  group_by(Year) %>%
  summarise(
    Total_EV_Purchases = n(),
    Tesla_Purchases = sum(Is_Tesla, na.rm = TRUE)
  ) %>%
  ungroup()

# interactive line plot
ev_trend_plot <- plot_ly(
  data = ev_trend,
  x = ~Year,
  y = ~Total_EV_Purchases,
  type = 'scatter',
  mode = 'lines+markers',
  name = 'Total EV Purchases',
  text = ~paste(
    "Year: ", Year,
    "<br>Total EV Purchases: ", Total_EV_Purchases,
    "<br>Tesla Purchases: ", Tesla_Purchases
  ),
  hoverinfo = 'text',
  line = list(color = 'blue')
) %>%
  add_trace(
    y = ~Tesla_Purchases,
    name = 'Tesla Purchases',
    mode = 'lines+markers',
    line = list(color = 'red')
  ) %>%
  layout(
    title = "Linear Progression of EV Purchases by Year",
    xaxis = list(title = "Year"),
    yaxis = list(title = "Number of EV Purchases"),
    legend = list(title = list(text = "EV Types"))
  )

ev_trend_plot

Explanation of Interactive Visualization (#1)

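The graph above shows Tesla and overall EV counts in Washington State by year. Tesla counts rise together with the overall total, which is expected in part because Tesla records are a subset of that total: the summary above lists 417,880 Tesla rows out of 1,048,575, roughly 40% of the dataset. Tesla therefore accounts for a major share of EV activity in the state. Each row is a title or registration transaction, including renewals, so these counts are a proxy for purchases rather than a direct measure.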
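To make Tesla's role more concrete, a yearly share can be computed instead of comparing two rising lines. The sketch below is illustrative only: `toy_ev` is a hypothetical miniature of `electric_vehicle_df` with made-up rows, and treating "Original Title" transactions as a stand-in for new purchases is our assumption, not something the dataset documentation confirms.

```r
library(dplyr)

# Hypothetical miniature of electric_vehicle_df (made-up rows, same column names)
toy_ev <- data.frame(
  Year = c(rep(2022, 5), rep(2023, 6)),
  Make = c("TESLA", "NISSAN", "TESLA", "FORD", "TESLA",
           "TESLA", "TESLA", "TESLA", "BMW", "NISSAN", "NISSAN"),
  Transaction.Type = c("Original Title", "Original Title", "Registration Renewal",
                       "Original Title", "Original Title",
                       "Original Title", "Original Title", "Original Title",
                       "Original Title", "Original Title", "Registration Renewal")
)

tesla_share <- toy_ev %>%
  # Assumption: keep only "Original Title" rows as a proxy for purchases
  filter(Transaction.Type == "Original Title") %>%
  group_by(Year) %>%
  summarise(
    Total = n(),                  # all remaining EV transactions in the year
    Tesla = sum(Make == "TESLA"), # Tesla transactions in the year
    .groups = "drop"
  ) %>%
  # Tesla's percentage of the yearly total
  mutate(Tesla_Share_Pct = round(100 * Tesla / Total, 1))

print(tesla_share)
```

A share that stays flat or rises over time would support the reading that Tesla drives much of the growth; a falling share would suggest other makes are catching up.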

Interactive Visualization (#2)

# clean and prepare the data
wa_crash_data <- crash_data_df %>%
  filter(State == "WA") %>%
  mutate(
    EV_Involvement = case_when(
      Vehicle_Type_1 == "EV" | Vehicle_Type_2 == "EV" ~ "EV",
      TRUE ~ "Non-EV"
    )
  )

wa_crash_summary <- wa_crash_data %>%
  group_by(EV_Involvement, Crash_Cause) %>%
  summarise(
    total_crashes = n(),
    fatalities = sum(Fatalities, na.rm = TRUE),
    injuries = sum(Injuries, na.rm = TRUE),
    .groups = "drop"
  )

# interactive bar chart
wa_interactive_crash_plot <- plot_ly(
  data = wa_crash_summary,
  x = ~EV_Involvement,
  y = ~total_crashes,
  color = ~Crash_Cause,
  type = "bar",
  text = ~paste(
    "Crash Cause: ", Crash_Cause,
    "<br>Total Crashes: ", total_crashes,
    "<br>Fatalities: ", fatalities,
    "<br>Injuries: ", injuries
  ),
  hoverinfo = "text"
) %>%
  layout(
    title = "Comparison of EV vs. Non-EV Crashes by Cause in WA State",
    xaxis = list(title = "Vehicle Type"),
    yaxis = list(title = "Number of Crashes"),
    barmode = "stack"
  )

wa_interactive_crash_plot

Explanation of Interactive Visualization (#2)

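The graph above compares the causes associated with EV and non-EV crashes in Washington State (the crash data is filtered to State == "WA"). Crashes with no EV involved clearly outnumber crashes involving at least one EV. Because this compares raw crash counts rather than rates per vehicle on the road, it does not by itself show that EVs are safer.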
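Since the two groups differ in size, comparing the mix of causes within each group can be more informative than comparing raw totals. The following is a minimal sketch of that normalization; `toy_summary` is a hypothetical stand-in for `wa_crash_summary` with made-up counts, and `share_pct` is a name introduced here for illustration.

```r
library(dplyr)

# Hypothetical stand-in for wa_crash_summary (made-up counts)
toy_summary <- data.frame(
  EV_Involvement = c("EV", "EV", "Non-EV", "Non-EV"),
  Crash_Cause    = c("DUI", "Speeding", "DUI", "Speeding"),
  total_crashes  = c(10, 30, 60, 60)
)

cause_share <- toy_summary %>%
  group_by(EV_Involvement) %>%
  # Each cause as a percentage of all crashes within its own group
  mutate(share_pct = 100 * total_crashes / sum(total_crashes)) %>%
  ungroup()

print(cause_share)
```

Plotting `share_pct` instead of `total_crashes` would put the EV and Non-EV bars on the same 0 to 100 scale, so differences in cause mix are visible even when one group is much smaller.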

Interactive Visualization (#3)

ev_trend_combined <- electric_vehicle_df %>%
  filter(!is.na(Year) & Year > 2000) %>%
  group_by(Year) %>%
  summarise(
    Total_EV_Purchases = n(),
    Tesla_Purchases = sum(grepl("TESLA", Make, ignore.case = TRUE), na.rm = TRUE),
    .groups = "drop"
  )

crash_trend_combined <- crash_data_df %>%
  filter(!is.na(Crash_Year)) %>%
  mutate(
    EV_Related = ifelse(
      grepl("EV", Vehicle_Type_1, ignore.case = TRUE) | 
      grepl("EV", Vehicle_Type_2, ignore.case = TRUE), 
      1, 0
    )
  ) %>%
  group_by(Crash_Year) %>%
  summarise(
    EV_Crashes = sum(EV_Related, na.rm = TRUE),
    Total_Crashes = n(),
    Fatalities = sum(Fatalities, na.rm = TRUE),
    Injuries = sum(Injuries, na.rm = TRUE),
    .groups = "drop"
  )

# ev purchases + crash per/yr
combined_data <- ev_trend_combined %>%
  full_join(crash_trend_combined, by = c("Year" = "Crash_Year")) %>%
  mutate(Year = as.integer(Year))  # Convert to integer if it's not
all_years <- data.frame(Year = seq(min(combined_data$Year, na.rm = TRUE),
                                  max(combined_data$Year, na.rm = TRUE), by = 1))

combined_data <- left_join(all_years, combined_data, by = "Year")
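# Zero-fill missing values. The crash data only covers 2020-2023, so a 0 crash
# count outside that range means "no data for that year", not "no crashes".
combined_data[is.na(combined_data)] <- 0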

# interactive plot
combined_plot <- plot_ly() %>%
  add_trace(
    data = combined_data,
    x = ~Year,
    y = ~Total_EV_Purchases,
    type = "scatter",
    mode = "lines+markers",
    name = "Total EV Purchases",
    line = list(color = "blue"),
    text = ~paste("Year: ", Year, "<br>Total EV Purchases: ", Total_EV_Purchases),
    hoverinfo = "text"
  ) %>%
  add_trace(
    data = combined_data,
    x = ~Year,
    y = ~EV_Crashes,
    type = "scatter",
    mode = "lines+markers",
    name = "EV-Related Crashes",
    line = list(color = "red"),
    text = ~paste("Year: ", Year, "<br>EV-Related Crashes: ", EV_Crashes),
    hoverinfo = "text"
  ) %>%
  add_trace(
    data = combined_data,
    x = ~Year,
    y = ~Tesla_Purchases,
    type = "scatter",
    mode = "lines+markers",
    name = "Tesla Purchases",
    line = list(color = "green"),
    text = ~paste("Year: ", Year, "<br>Tesla Purchases: ", Tesla_Purchases),
    hoverinfo = "text"
  ) %>%
  layout(
    title = "EV Purchases vs. EV-Related Crashes (Actual Numbers)",
    xaxis = list(title = "Year"),
    yaxis = list(title = "Counts (Purchases/Crashes)", side = "left"),
    legend = list(title = list(text = "Legend")),
    hovermode = "closest"
  )

# Render the plot
combined_plot

Explanation of Interactive Visualization (#3)

The graph above plots total EV purchases, EV-related crashes, and Tesla purchases by year. EV-related crashes appear to decline steadily while EV purchases rise steadily. This pattern is consistent with more EVs on the road not producing more EV-related crashes, but it is a correlation and does not establish that EV growth caused the decline. The EV-related crash line sits near the bottom of the graph because crash counts are far smaller than the purchase counts plotted on the same axis: the crash dataset has 10,000 rows in total, against more than a million EV records. The crash counts also cover all states, while the EV records come from Washington State data. The low counts may also reflect that EVs are relatively new to the market.
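One way to keep the small crash counts readable next to the much larger purchase counts is a secondary y-axis. The sketch below uses plotly's `yaxis2` layout with `overlaying = "y"`; `toy_combined` is a hypothetical stand-in for `combined_data`, and its values are made up purely to show the scale gap.

```r
library(plotly)

# Hypothetical stand-in for combined_data (made-up values)
toy_combined <- data.frame(
  Year = 2020:2023,
  Total_EV_Purchases = c(90000, 120000, 180000, 260000),
  EV_Crashes = c(820, 810, 830, 800)
)

dual_axis_plot <- plot_ly(data = toy_combined, x = ~Year) %>%
  # Purchases use the default (left) y-axis
  add_trace(
    y = ~Total_EV_Purchases,
    type = "scatter",
    mode = "lines+markers",
    name = "Total EV Purchases"
  ) %>%
  # Crashes are assigned to a second y-axis so their variation stays visible
  add_trace(
    y = ~EV_Crashes,
    type = "scatter",
    mode = "lines+markers",
    name = "EV-Related Crashes",
    yaxis = "y2"
  ) %>%
  layout(
    title = "EV Purchases vs. EV-Related Crashes (separate axes)",
    xaxis = list(title = "Year"),
    yaxis = list(title = "EV Purchases"),
    yaxis2 = list(title = "EV-Related Crashes", overlaying = "y", side = "right")
  )

dual_axis_plot
```

Dual axes make each trend legible, but the two scales are independent, so the visual crossing points of the lines carry no meaning.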

Conclusion

The first takeaway is the overall increase in the total number of crashes. The data shows an upward trend in motor vehicle crashes, which could be attributed to factors such as the rising number of vehicles on the road and the growing driving population. As more vehicles are used, the number of accidents would be expected to increase.

A second important takeaway is the rise in crashes involving electric vehicles. While the number of total crashes is rising, the rate at which EV-related crashes occur appears to be growing at a different pace. This trend could be linked to the increasing number of electric vehicles on the road, as their adoption continues to accelerate. The data visualizations effectively highlight this difference, showing how EV-related crashes are becoming more prevalent as the EV population expands.

Lastly, the project underscores the difference in crash counts between EVs and non-EVs. Despite the rise in EV-related crashes, non-electric vehicles are still involved in more crashes overall. This is primarily because electric vehicles are a relatively new addition to the roads, and their numbers are still catching up. However, as the number of EVs continues to grow, the number of EV-related crashes is expected to increase as well, closely mirroring the trend in overall vehicle crashes. These insights emphasize the evolving landscape of road safety as electric vehicles become more widespread.
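The takeaways above refer to the pace of EV-related crashes relative to total crashes. A minimal sketch of how that pace could be quantified is shown here, using the EV share of all crashes and the year-over-year change in EV-related crashes. `toy_crashes` is a hypothetical stand-in for `crash_trend_combined` with made-up counts, and the column names `EV_Share_Pct` and `EV_YoY_Pct` are introduced only for this example.

```r
library(dplyr)

# Hypothetical stand-in for crash_trend_combined (made-up counts)
toy_crashes <- data.frame(
  Crash_Year    = 2020:2023,
  EV_Crashes    = c(100, 110, 121, 121),
  Total_Crashes = c(1000, 1000, 1100, 1210)
)

crash_rates <- toy_crashes %>%
  arrange(Crash_Year) %>%
  mutate(
    # EV-related crashes as a percentage of all crashes in the year
    EV_Share_Pct = 100 * EV_Crashes / Total_Crashes,
    # Percent change in EV-related crashes versus the previous year
    # (NA for the first year, which has no predecessor)
    EV_YoY_Pct = 100 * (EV_Crashes - dplyr::lag(EV_Crashes)) / dplyr::lag(EV_Crashes)
  )

print(crash_rates)
```

A rising `EV_Share_Pct` would indicate EV-related crashes growing faster than crashes overall, while a flat share would indicate that they simply track the total.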